Second and third place go to Google’s Gemini 3 Pro at 2.9% and Mistral’s Voxtral Small at 3.0%, respectively. Other strong performers include Google Gemini 3 Flash at 3.1% and ElevenLabs Scribe v1 at 3.2%. In the middle of the pack are models such as OpenAI’s GPT-4o Transcribe at 4.0% and Whisper Large v3 at 4.2%. Toward the lower end of the ranking are Alibaba’s Qwen3 ASR Flash at 5.9%, Amazon Nova 2 Omni at 6.0%, and Rev AI at 6.1%.
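For context on the metric behind these rankings: word error rate is conventionally computed as the word-level edit distance (substitutions, deletions, and insertions) between the model's transcript and a reference transcript, divided by the number of words in the reference. A minimal sketch (the function name `wer` and the example sentences are illustrative, not from the benchmark):

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level edit distance divided by reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    # Standard Levenshtein dynamic program, but over words instead of characters.
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i  # deleting all remaining reference words
    for j in range(len(hyp) + 1):
        d[0][j] = j  # inserting all remaining hypothesis words
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            sub = d[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1])  # match or substitution
            d[i][j] = min(sub, d[i - 1][j] + 1, d[i][j - 1] + 1)
    return d[len(ref)][len(hyp)] / len(ref)


# One deletion against a six-word reference gives a WER of about 16.7%.
print(wer("the cat sat on the mat", "the cat sat on mat"))
```

So a 2.9% score means roughly three word-level errors per hundred reference words; production benchmarks like AA-WER additionally normalize transcripts (casing, punctuation) before scoring, which this sketch omits.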

ElevenLabs Scribe v2 leads the overall AA-WER v2.0 benchmark ranking with the lowest word error rate, followed by Google Gemini 3 Pro and Mistral Voxtral Small. | Image: Artificial Analysis

In a separate benchmark focused specifically on speech directed at voice assistants, the overall picture remains largely the same. Scribe v2 again leads with a word error rate of 1.6%, followed closely by Gemini 3 Pro at 1.7%. AssemblyAI’s Universal-3 Pro ranks third with 2.3%.

In the AA-AgentTalk test for speech on voice assistants, Scribe v2 from ElevenLabs and Gemini 3 Pro from Google also dominate with the lowest error rates. | Image: Artificial Analysis